
    A dynamic framework based on local Zernike Moment and motion history image for facial expression recognition

    A dynamic descriptor facilitates robust recognition of facial expressions in video sequences. The two main current approaches to recognition are basic emotion recognition and recognition based on facial action coding system (FACS) action units. In this paper we focus on basic emotion recognition and propose a spatio-temporal feature based on the local Zernike moment in the spatial domain, using motion change frequency. We also design a dynamic feature comprising the motion history image and entropy. To recognise a facial expression, a weighting strategy based on the latter feature and a sub-division of the image frame is applied to the former to enhance the dynamic information of the facial expression, followed by the application of a classical support vector machine. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme demonstrate that the integrated framework achieves better performance than either descriptor used separately. Compared with six state-of-the-art methods, the proposed framework demonstrates superior performance.
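    As an illustration of the motion-history component, here is a minimal NumPy sketch of a standard motion history image update (the timestamp-decay formulation); the parameter names and the thresholded frame-difference input are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def update_mhi(mhi, frame_diff, tau=255, delta=32, decay=1):
    """One step of a motion history image (MHI) update.

    mhi:        float array holding the current motion history
    frame_diff: absolute difference between consecutive grayscale frames
    tau:        value assigned to pixels where motion is detected
    delta:      motion threshold on the frame difference
    decay:      amount by which stale history fades each step
    """
    motion = frame_diff > delta
    # Pixels with motion are reset to tau; the rest decay towards zero.
    return np.where(motion, tau, np.maximum(mhi - decay, 0))
```

    The entropy of the resulting MHI can then serve as a scalar summary of how much motion each facial sub-region carries, roughly the role the dynamic feature plays in the weighting strategy.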

    A spatial-temporal framework based on histogram of gradients and optical flow for facial expression recognition in video sequences

    Facial expression causes different parts of the facial region to change over time, and thus dynamic descriptors are inherently more suitable than static descriptors for recognising facial expressions. In this paper, we extend the spatial pyramid histogram of gradients to the spatio-temporal domain to give 3-dimensional facial features and integrate them with dense optical flow to give a spatio-temporal descriptor which extracts both the spatial and dynamic motion information of facial expressions. A multi-class support vector machine classifier with a one-vs-one strategy is used to recognise facial expressions. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme demonstrate that the integrated framework achieves better performance than either descriptor used separately. Compared with six state-of-the-art methods, the proposed framework demonstrates superior performance.
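    To make the flow component concrete, the following Python sketch computes dense optical flow between two frames with OpenCV's Farnebäck method and summarises it as a magnitude-weighted orientation histogram; this is a generic simplification, not the authors' exact PHOG-based spatio-temporal descriptor, and the parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

def flow_orientation_histogram(prev_gray, next_gray, bins=8):
    # Dense optical flow: one 2D displacement vector per pixel.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Orientation histogram weighted by motion magnitude, L1-normalised.
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)
```

    Concatenating such histograms over facial sub-regions and time windows yields a simple spatio-temporal motion feature that a multi-class SVM can consume.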

    Robust contactless pulse transit time estimation based on signal quality metric

    The pulse transit time (PTT) can provide valuable insight into cardiovascular health, specifically regarding arterial stiffness and blood pressure. Traditionally, PTT is derived by calculating the time difference between two photoplethysmography (PPG) measurements, which requires a set of body-worn sensors attached to the skin. Recently, remote photoplethysmography (rPPG) has been proposed as a contactless monitoring alternative. The main problem with rPPG-based PTT estimation is that motion artifacts affect the shape of the waveform, leading to shifted or over-detected peaks, which decreases the accuracy of the PTT. To overcome this problem, this paper presents a robust pulse-by-pulse PTT estimation framework using a signal quality metric. By exploiting local temporal information and global periodic characteristics, the metric automatically assesses signal quality on a pulse-by-pulse basis and calculates the probability that each detected peak is an actual pulse peak. Furthermore, to cope with over-detected and shifted pulse peaks, a Kalman filter complemented by the proposed signal quality metric is used to adaptively adjust the peaks based on the estimated probability. All the refined peaks are finally used for pulse-by-pulse PTT estimation. The experimental results are promising, suggesting that the proposed framework provides a robust and more accurate PTT estimation in real applications.
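    For intuition, here is a deliberately naive Python sketch of pulse-by-pulse PTT from two synchronised PPG-like signals; it omits the paper's signal quality metric and Kalman-based peak refinement, and matching peaks by index is an assumption that only holds for clean signals.

```python
import numpy as np
from scipy.signal import find_peaks

def naive_ptt(ppg_proximal, ppg_distal, fs):
    """Pulse-by-pulse PTT as the time lag between matched peaks (seconds).

    fs: sampling rate in Hz. A minimum peak distance of ~0.4 s keeps
    detections to at most one per plausible cardiac cycle.
    """
    p1, _ = find_peaks(ppg_proximal, distance=int(0.4 * fs))
    p2, _ = find_peaks(ppg_distal, distance=int(0.4 * fs))
    n = min(len(p1), len(p2))
    return (p2[:n] - p1[:n]) / fs
```

    In the paper's framework, each detected peak would additionally carry a quality-derived probability, and low-probability (shifted or spurious) peaks would be corrected by the Kalman filter before the PTT is computed.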

    Fusing dynamic deep learned features and handcrafted features for facial expression recognition

    The automated recognition of facial expressions has been actively researched due to its wide-ranging applications. Recent advances in deep learning have improved the performance of facial expression recognition (FER) methods. In this paper, we propose a framework that combines discriminative features learned using convolutional neural networks with handcrafted features, including shape- and appearance-based features, to further improve the robustness and accuracy of FER. In addition, texture information is extracted from facial patches to enhance the discriminative power of the extracted features. By encoding shape, appearance, and deep dynamic information, the proposed framework provides high performance and outperforms state-of-the-art FER methods on the CK+ dataset.
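    A hedged sketch of the fusion step: deep and handcrafted feature vectors are concatenated per sample and fed to a classifier. The feature extractors themselves are assumed to exist already, and the linear SVM here is an illustrative stand-in rather than the paper's exact classifier.

```python
import numpy as np
from sklearn.svm import SVC

def fuse_and_train(deep_feats, handcrafted_feats, labels):
    """deep_feats: (N, D1) CNN features; handcrafted_feats: (N, D2)
    shape/appearance features; labels: (N,) expression classes."""
    fused = np.concatenate([deep_feats, handcrafted_feats], axis=1)
    clf = SVC(kernel="linear")
    clf.fit(fused, labels)
    return clf
```

    In practice the two feature families are often normalised separately (e.g., z-scored) before concatenation so that neither dominates the fused representation.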

    Spatio-temporal framework on facial expression recognition.

    This thesis presents an investigation into two topics that are important in facial expression recognition: how to employ the dynamic information in facial expression image sequences, and how to efficiently extract context and other relevant information from different facial regions. This involves the development of spatio-temporal frameworks for recognising facial expression. The thesis proposes three novel frameworks.

    The first framework uses sparse representation to extract features from patches of a face to improve recognition performance, applying part-based methods that are robust to image misalignment. In addition, the use of sparse representation reduces the dimensionality of the features, enhances their semantic meaning, and represents a face image more efficiently.

    Since a facial expression involves a dynamic process, and that process contains information that describes the expression more effectively, it is important to capture such dynamic information in order to recognise facial expressions over an entire video sequence. Thus, the second framework uses two types of dynamic information to enhance recognition: a novel spatio-temporal descriptor based on PHOG (pyramid histogram of gradients) to represent changes in facial shape, and dense optical flow to estimate the movement (displacement) of facial landmarks. The framework views an image sequence as a spatio-temporal volume, and uses temporal information to represent the dynamic movement of facial landmarks associated with a facial expression. Specifically, a spatial descriptor representing local shape is extended to the spatio-temporal domain to capture changes in the local shape of facial sub-regions along the temporal dimension, giving 3D facial component sub-regions for the forehead, mouth, eyebrows and nose. Dense optical flow is also employed to extract temporal information. The fusion of these two descriptors enhances the dynamic information and achieves better performance than either descriptor alone.

    The third framework also focuses on analysing the dynamics of facial expression sequences to represent spatio-temporal dynamic information (i.e., velocity). Two types of features are generated: a spatio-temporal shape representation that enhances the local spatial and dynamic information, and a dynamic appearance representation. In addition, an entropy-based method is introduced to capture the spatial relationships between different parts of the face by computing the entropy values of its sub-regions.
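    As a small illustration of the entropy-based spatial analysis described for the third framework, the sketch below computes the Shannon entropy of each cell in a grid of facial sub-regions; the grid size and 8-bit grayscale input are assumptions for illustration.

```python
import numpy as np

def subregion_entropy(gray, grid=4):
    """Shannon entropy (bits) of each cell in a grid x grid partition
    of an 8-bit grayscale face image."""
    h, w = gray.shape
    ph, pw = h // grid, w // grid
    ent = np.zeros((grid, grid))
    for i in range(grid):
        for j in range(grid):
            patch = gray[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            hist, _ = np.histogram(patch, bins=256, range=(0, 256),
                                   density=True)
            p = hist[hist > 0]
            ent[i, j] = -np.sum(p * np.log2(p))
    return ent
```

    Higher-entropy sub-regions (typically the mouth and eyes during an expression) can then be given more weight when the per-region descriptors are combined.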

    A Segmentation-Guided Deep Learning Framework for Leaf Counting

    Deep learning-based methods have recently provided a means to rapidly and effectively extract various plant traits, due to their powerful ability to depict a plant image across a variety of species and growth conditions. In this study, we focus on two fundamental tasks in plant phenotyping, i.e., plant segmentation and leaf counting, and propose a two-stream deep learning framework for segmenting plants and counting leaves of various sizes and shapes from two-dimensional plant images. In the first stream, a multi-scale segmentation model using a spatial pyramid is developed to extract leaves of different sizes and shapes, where the fine-grained details of leaves are captured using a deep feature extractor. In the second stream, a regression counting model is proposed to estimate the number of leaves without any pre-detection, where an auxiliary binary mask from the segmentation stream is introduced to enhance the counting performance by effectively alleviating the influence of complex backgrounds. Extensive pot experiments were conducted on the CVPPP 2017 Leaf Counting Challenge dataset, which contains images of Arabidopsis and tobacco plants. The experimental results demonstrate that the proposed framework achieves promising performance in both plant segmentation and leaf counting, providing a reference for the automatic analysis of plant phenotypes.
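    The following PyTorch sketch shows the idea of mask-guided counting: a predicted binary mask from the segmentation stream gates the input before a regression head estimates the leaf count. The tiny backbone and the simple multiplicative gating are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MaskGuidedCounter(nn.Module):
    """Regress a leaf count from an image gated by a segmentation mask."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, image, mask):
        # mask: (N, 1, H, W) in [0, 1]; broadcasting suppresses background.
        gated = image * mask
        feats = self.backbone(gated).flatten(1)
        return self.head(feats)  # (N, 1) predicted count
```

    Because the count is regressed directly, no per-leaf detection step is needed, which matches the abstract's description of counting without pre-detection.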

    Discriminative attention-augmented feature learning for facial expression recognition in the wild

    Facial expression recognition (FER) in the wild is challenging due to unconstrained settings such as varying head poses, illumination, and occlusions. In addition, the performance of a FER system degrades significantly under the large intra-class variation and inter-class similarity of facial expressions in real-world scenarios. To mitigate these problems, we propose a novel approach, the Discriminative Attention-augmented Feature Learning Convolutional Neural Network (DAF-CNN), which learns discriminative expression-related representations for FER. Firstly, we develop a 3D attention mechanism for feature refinement which selectively focuses on attentive channel entries and salient spatial regions of a convolutional neural network feature map. Moreover, a deep metric loss termed the Triplet-Center (TC) loss is incorporated to further enhance the discriminative power of the deeply-learned features under an expression-similarity constraint. It simultaneously minimizes intra-class distance and maximizes inter-class distance to learn features that are both compact and well separated. Extensive experiments have been conducted on two representative facial expression datasets (FER-2013 and SFEW 2.0), demonstrating that DAF-CNN effectively captures discriminative feature representations and achieves competitive or even superior FER performance compared to state-of-the-art methods.
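    A minimal PyTorch sketch of a triplet-center-style loss, assuming learnable per-class center vectors are maintained alongside the network; the margin value and the Euclidean distance are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def triplet_center_loss(features, labels, centers, margin=1.0):
    """features: (N, D) embeddings; labels: (N,) int64 classes;
    centers: (C, D) one learnable center per expression class."""
    d = torch.cdist(features, centers)                 # (N, C) distances
    pos = d.gather(1, labels.unsqueeze(1)).squeeze(1)  # own-center distance
    d_other = d.clone()
    d_other.scatter_(1, labels.unsqueeze(1), float("inf"))
    neg = d_other.min(dim=1).values                    # nearest rival center
    # Pull towards own center, push past the nearest other center.
    return F.relu(pos - neg + margin).mean()
```

    The hinge goes to zero once a feature is at least `margin` closer to its own center than to any other center, which is exactly the compact-and-separated geometry the abstract describes.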

    TCDNet: tree crown detection from UAV optical images using uncertainty-aware one-stage network

    Tree crown detection plays a vital role in forestry management, resource statistics and yield forecasting. RGB high-resolution aerial images have emerged as a cost-effective source of data for tree crown detection. To address the challenges of detection in UAV optical images, we propose a one-stage object detection network, TCDNet. First, the network provides an attention-enhanced feature extraction module that enables the model to distinguish tree crowns from their complex backgrounds. Second, an efficient loss is introduced to make the network aware of the overlap between adjacent trees, thus effectively avoiding misdetection. Experimental results on two publicly available datasets show that the proposed network outperforms state-of-the-art networks in terms of precision, recall and mean average precision.
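    Since the abstract highlights an overlap-aware loss, here is a generic IoU loss sketch for axis-aligned boxes in PyTorch; it is a common stand-in for overlap-sensitive objectives, not necessarily the specific loss TCDNet uses.

```python
import torch

def iou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2)."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    return (1.0 - inter / (union + eps)).mean()
```

    Optimising overlap directly, rather than regressing coordinates independently, penalises boxes that drift onto a neighbouring crown, which is the failure mode the abstract targets.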

    Semi-supervised learning for forest fire segmentation using UAV imagery

    Unmanned aerial vehicles (UAVs) are an efficient tool for monitoring forest fires due to their advantages, e.g., low cost, light weight and flexibility. Semantic segmentation can provide a means for an aircraft to rapidly and accurately determine the location of a forest fire. However, training a semantic segmentation model requires a large number of labeled images, which are labor-intensive and time-consuming to generate. To address the lack of labeled images, we propose, in this paper, a semi-supervised learning-based segmentation network, SemiFSNet. Taking into account the unique characteristics of UAV-acquired imagery of forest fires, the proposed method first applies occlusion-aware data augmentation to the labeled data to increase the robustness of the trained model. In SemiFSNet, a dynamic encoder network replaces ordinary convolution with dynamic convolution, enabling the learned features to better represent fires of varying size and shape. To mitigate the impact of complex scene backgrounds, we also propose a feature refinement module that integrates an attention mechanism to highlight salient feature information, thus improving the performance of the segmentation network. Additionally, consistency regularization is introduced to exploit the rich information contained in unlabeled data, thus aiding semi-supervised learning. To validate the effectiveness of the proposed method, extensive experiments were conducted on the FLAME dataset and the Corsican dataset. The experimental results show that the proposed model outperforms state-of-the-art methods and is competitive with its fully supervised counterpart.
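    To make the consistency-regularization idea concrete, the PyTorch sketch below encourages a segmentation model to give matching pixel-wise predictions for an unlabeled image and a photometrically augmented copy of it; the KL objective and the specific augmentation are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, images, augment):
    """images: unlabeled batch (N, 3, H, W); augment: a label-preserving,
    non-geometric transform (e.g., color jitter) so pixels stay aligned."""
    with torch.no_grad():
        target = model(images).softmax(dim=1)        # (N, C, H, W) probs
    pred = model(augment(images)).log_softmax(dim=1)
    return F.kl_div(pred, target, reduction="batchmean")
```

    This term is added to the supervised loss on the labeled images, so the unlabeled data shape the decision boundaries without ever needing masks.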
